Learning representations for binary-classification without backpropagation
The family of feedback alignment (FA) algorithms aims to provide a more biologically motivated alternative to backpropagation (BP) by substituting the computations that are unrealistic to implement in physical brains. While FA algorithms have been shown to work well in practice, rigorous theory proving their learning capabilities has been lacking. Here we introduce the first feedback alignment algorithm with provable learning guarantees. In contrast to existing work, we require no assumptions about the size or depth of the network except that it has a single output neuron, as in binary classification tasks. We show that our FA algorithm delivers on its theoretical promises in practice, surpassing the learning performance of existing FA methods and matching backpropagation on binary classification tasks. Finally, we demonstrate the limits of our FA variant when the number of output neurons grows beyond a certain quantity.
Response Characterization for Auditing Cell Dynamics in Long Short-term Memory Networks
In this paper, we introduce a novel method to interpret recurrent neural
networks (RNNs), particularly long short-term memory networks (LSTMs) at the
cellular level. We propose a systematic pipeline for interpreting individual
hidden state dynamics within the network using response characterization
methods. The ranked contribution of individual cells to the network's output is
computed by analyzing a set of interpretable metrics of their decoupled step
and sinusoidal responses. As a result, our method is able to uniquely identify
neurons with insightful dynamics, quantify relationships between dynamical
properties and test accuracy through ablation analysis, and interpret the
impact of network capacity on a network's dynamical distribution. Finally, we
demonstrate the generalizability and scalability of our method by evaluating
it on a series of benchmark sequential datasets.
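The response-characterization idea can be sketched as follows: probe each hidden cell with decoupled step and sinusoidal inputs, record its trace, and rank cells by simple metrics of those traces. The tiny randomly initialized LSTM below is a stand-in for a trained network; the metrics are illustrative examples, not the paper's full set.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyLSTM:
    """Minimal LSTM cell (random weights) used only to probe cell dynamics."""
    def __init__(self, n_in, n_hid):
        s = 1.0 / np.sqrt(n_hid)
        self.W = rng.uniform(-s, s, size=(4 * n_hid, n_in + n_hid))
        self.b = np.zeros(4 * n_hid)
        self.n_hid = n_hid

    def run(self, inputs):
        h = np.zeros(self.n_hid)
        c = np.zeros(self.n_hid)
        trace = []
        for x in inputs:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            trace.append(h.copy())
        return np.array(trace)  # (T, n_hid): per-cell hidden-state response

lstm = TinyLSTM(n_in=1, n_hid=6)
T = 100
step = lstm.run(np.ones((T, 1)))                                # step response
sine = lstm.run(np.sin(np.linspace(0, 4 * np.pi, T))[:, None])  # sinusoid

# Simple per-cell response metrics in the spirit of the pipeline:
final_value = step[-1]                 # steady-state of the step response
amplitude = sine.max(0) - sine.min(0)  # output swing under a sinusoid
ranking = np.argsort(-amplitude)       # rank cells by responsiveness
```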
LNCS
Quantization converts neural networks into low-bit fixed-point computations which can be carried out by efficient integer-only hardware, and is standard practice for the deployment of neural networks on real-time embedded devices. However, like their real-numbered counterparts, quantized networks are not immune to malicious misclassification caused by adversarial attacks. We investigate how quantization affects a network’s robustness to adversarial attacks, which is a formal verification question. We show that neither robustness nor non-robustness is monotonic in the number of bits used for the representation and, moreover, that neither is preserved by quantization from a real-numbered network. For this reason, we introduce a verification method for quantized neural networks which, using SMT solving over bit-vectors, accounts for their exact, bit-precise semantics. We built a tool and analyzed the effect of quantization on a classifier for the MNIST dataset. We demonstrate that, compared to our method, existing methods for the analysis of real-numbered networks often derive false conclusions about their quantizations, both when determining robustness and when detecting attacks, and that existing methods for quantized networks often miss attacks. Furthermore, we applied our method beyond robustness, showing how the number of bits in quantization enlarges the gender bias of a predictor for students’ grades.
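The fixed-point semantics at issue can be illustrated with a uniform signed quantizer. The weights, inputs, and scales below are invented for illustration; they show how rounding can move a score across the decision boundary, so the real-valued network and its quantization classify the same input differently.

```python
import numpy as np

def quantize(x, n_bits, scale):
    """Uniform signed fixed-point quantization: round to the nearest
    representable multiple of `scale` and saturate to the n_bits range."""
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale

# A toy real-valued score function and a 4-bit quantized counterpart.
w = np.array([0.30, -0.52])
x = np.array([0.49, 0.27])

real_score = float(w @ x)                                    # idealized arithmetic
q_score = float(quantize(w, 4, 0.25) @ quantize(x, 4, 0.25)) # bit-limited arithmetic
# Near the boundary, rounding flips the decision: here real_score is
# strictly positive while the quantized score is not.
```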
Scalable Verification of Quantized Neural Networks (Technical Report)
Formal verification of neural networks is an active topic of research, and
recent advances have significantly increased the size of the networks that
verification tools can handle. However, most methods are designed for
verification of an idealized model of the actual network which works over real
arithmetic and ignores rounding imprecisions. This idealization is in stark
contrast to network quantization, which is a technique that trades numerical
precision for computational efficiency and is, therefore, often applied in
practice. Neglecting rounding errors of such low-bit quantized neural networks
has been shown to lead to wrong conclusions about the network's correctness.
Thus, the desired approach for verifying quantized neural networks would be one
that takes these rounding errors into account. In this paper, we show that
verifying the bit-exact implementation of quantized neural networks with
bit-vector specifications is PSPACE-hard, even though verifying idealized
real-valued networks and satisfiability of bit-vector specifications alone are
each in NP. Furthermore, we explore several practical heuristics toward closing
the complexity gap between idealized and bit-exact verification. In particular,
we propose three techniques for making SMT-based verification of quantized
neural networks more scalable. Our experiments demonstrate that our proposed
methods allow a speedup of up to three orders of magnitude over existing
approaches.
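As a toy illustration of bit-exact verification, the sketch below checks local robustness of a two-input integer "network" by exhaustive enumeration; an SMT solver over bit-vectors explores the same bit-precise semantics (including the rounding right-shift) symbolically rather than by brute force. All numbers are invented.

```python
import itertools

def qnet(x, w, shift):
    """Bit-exact quantized 'network': integer dot product followed by an
    arithmetic right shift, the rounding step fixed-point hardware uses."""
    acc = sum(wi * xi for wi, xi in zip(w, x))
    return acc >> shift  # floor division by 2**shift, sign-preserving

def verify_robust(x0, w, shift, eps):
    """Check that every integer input within an L-infinity ball of radius
    eps keeps the sign of the output; returns a counterexample if not."""
    y0 = qnet(x0, w, shift) >= 0
    deltas = range(-eps, eps + 1)
    for d in itertools.product(deltas, repeat=len(x0)):
        x = [xi + di for xi, di in zip(x0, d)]
        if (qnet(x, w, shift) >= 0) != y0:
            return False, x  # adversarial input found
    return True, None

w = [3, -2]
robust, cex = verify_robust([4, 5], w, shift=1, eps=1)
```

Enumeration is exponential in the input dimension, which is exactly why the paper's SMT-based encoding (and the scalability heuristics above) are needed for realistic networks.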
Dataset Distillation with Convexified Implicit Gradients
We propose a new dataset distillation algorithm using reparameterization and
convexification of implicit gradients (RCIG), which substantially improves the
state-of-the-art. To this end, we first formulate dataset distillation as a
bi-level optimization problem. Then, we show how implicit gradients can be
effectively used to compute meta-gradient updates. We further equip the
algorithm with a convexified approximation that corresponds to learning on top
of a frozen finite-width neural tangent kernel. Finally, we mitigate the bias in
implicit gradients by parameterizing the neural network to enable analytical
computation of final-layer parameters given the body parameters. RCIG
establishes the new state-of-the-art on a diverse series of dataset
distillation tasks. Notably, with one image per class, on resized ImageNet,
RCIG sees on average a 108% improvement over the previous state-of-the-art
distillation algorithm. Similarly, we observe a 66% gain over the previous
state-of-the-art on Tiny-ImageNet and a 37% gain on CIFAR-100.
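The analytical final-layer computation mentioned above amounts to a convex least-squares problem over the last linear layer given frozen body features. A minimal sketch, with a ridge term and random stand-in features (not the paper's architecture or data):

```python
import numpy as np

rng = np.random.default_rng(0)

def final_layer_ridge(features, targets, lam=1e-3):
    """Closed-form (ridge) solution for the last linear layer given frozen
    body features: the inner problem over final-layer weights is convex,
    so it can be solved analytically instead of by inner-loop SGD."""
    F = features                              # (n, d) body outputs
    A = F.T @ F + lam * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ targets)  # (d, k) weights

# Toy setting: random 'body' features for a handful of distilled points.
F = rng.normal(size=(10, 5))
Y = rng.normal(size=(10, 3))
W = final_layer_ridge(F, Y)
residual = np.linalg.norm(F @ W - Y)
```

Because `W` is an exact stationary point of the ridge objective, meta-gradients through it avoid the truncation bias of an unrolled or implicitly differentiated inner loop.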
Learning Control Policies for Stochastic Systems with Reach-avoid Guarantees
We study the problem of learning controllers for discrete-time non-linear
stochastic dynamical systems with formal reach-avoid guarantees. This work
presents the first method for providing formal reach-avoid guarantees, which
combine and generalize stability and safety guarantees, with a tolerable
probability threshold over the infinite time horizon. Our method
leverages advances in the machine learning literature and represents formal
certificates as neural networks. In particular, we learn a certificate in the
form of a reach-avoid supermartingale (RASM), a novel notion that we introduce
in this work. Our RASMs provide reachability and avoidance guarantees by
imposing constraints on what can be viewed as a stochastic extension of level
sets of Lyapunov functions for deterministic systems. Our approach solves
several important problems -- it can be used to learn a control policy from
scratch, to verify a reach-avoid specification for a fixed control policy, or
to fine-tune a pre-trained policy if it does not satisfy the reach-avoid
specification. We validate our approach on stochastic non-linear
reinforcement learning tasks.
Comment: Accepted at AAAI 202
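The expected-decrease condition at the heart of a supermartingale certificate can be sketched empirically: for states in a region of interest, the expected certificate value after one stochastic step must drop by a margin. The system, quadratic certificate, and Monte Carlo check below are illustrative stand-ins for the learned neural RASM and the formal verifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(x):
    """Toy discrete-time stochastic system: contraction plus bounded noise."""
    return 0.9 * x + rng.uniform(-0.05, 0.05)

def V(x):
    """Candidate certificate (in the paper this is a learned neural
    network; here a quadratic stands in)."""
    return x ** 2

def check_decrease(xs, n_samples=2000, margin=0.0):
    """Empirically check the expected-decrease condition
    E[V(x')] <= V(x) - margin at each sampled state (a verifier must
    bound this for ALL states in the region, not just samples)."""
    for x in xs:
        ev_next = np.mean([V(step(x)) for _ in range(n_samples)])
        if ev_next > V(x) - margin:
            return False
    return True

ok = check_decrease(np.linspace(0.5, 2.0, 8), margin=0.01)
```

Here E[V(0.9x + u)] = 0.81 x^2 + Var(u), so the decrease margin holds whenever 0.19 x^2 exceeds Var(u) + 0.01, which covers the sampled region.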
Liquid Time-constant Networks
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities,
we construct networks of linear first-order dynamical systems modulated via
nonlinear interlinked gates. The resulting models represent dynamical systems
with varying (i.e., liquid) time-constants coupled to their hidden state, with
outputs being computed by numerical differential equation solvers. These neural
networks exhibit stable and bounded behavior, yield superior expressivity
within the family of neural ordinary differential equations, and give rise to
improved performance on time-series prediction tasks. To demonstrate these
properties, we first take a theoretical approach to find bounds over their
dynamics and compute their expressive power by the trajectory length measure in
latent trajectory space. We then conduct a series of time-series prediction
experiments to manifest the approximation capability of Liquid Time-Constant
Networks (LTCs) compared to classical and modern RNNs. Code and data are
available at https://github.com/raminmh/liquid_time_constant_networks
Comment: Accepted to the Thirty-Fifth AAAI Conference on Artificial
Intelligence (AAAI-21)
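The liquid time-constant dynamics can be sketched with a fused semi-implicit Euler update of the LTC equation dx/dt = -(1/tau + f(x, I)) x + f(x, I) A, in which the effective time constant tau/(1 + tau f) varies with input and state. The single-sigmoid gate below is a simplification of the full model; sizes and the driving signal are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ltc_fused_step(x, I, dt, tau, A, W, b):
    """One fused semi-implicit Euler step of a liquid time-constant cell:
    solving x' = (x + dt*f*A) / (1 + dt*(1/tau + f)) keeps the update
    stable even for stiff, input-dependent time constants."""
    f = sigmoid(W @ np.concatenate([x, I]) + b)  # nonlinear gate, f in (0, 1)
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

n_hid, n_in = 4, 2
W = rng.normal(scale=0.3, size=(n_hid, n_hid + n_in))
b = np.zeros(n_hid)
tau, A, dt = 1.0, rng.normal(size=n_hid), 0.1

x = np.zeros(n_hid)
for t in range(200):                 # drive with a sinusoidal input
    I = np.array([np.sin(0.1 * t), np.cos(0.1 * t)])
    x = ltc_fused_step(x, I, dt, tau, A, W, b)
```

With f positive and bounded, each state component stays inside the interval set by its bias term A, consistent with the stable, bounded behavior claimed above.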